Section: Scientific Foundations
Model Evaluation from Experimental Data
This point is central to the DIGIPLANTE project.
Parametric identification: estimation of model parameters and evaluation of estimation uncertainty
-
Theory and methods. Parameters of the GreenLab model have classically been estimated with Generalized Least Squares estimators, assuming a diagonal covariance matrix for the errors on the model outputs. While this approach has provided efficient algorithms for finding parameters that give a satisfactory goodness-of-fit, it has proved too restrictive for studying model robustness and determining confidence intervals for the parameters, which is crucial for applications. A new approach is currently being studied to improve estimation and uncertainty characterization. An equivalent description of the dynamic system in the framework of hidden (latent variable) models has been formulated (cf. Cournède et al., 2011). Statistical estimation in this framework can be tackled with tools borrowed from the theory of hidden Markov models, such as maximum likelihood estimation. In our case, the associated likelihood function cannot be computed in closed form. Simulation-based methods are therefore being developed to implement suitable stochastic versions of the EM algorithm and stochastic gradient methods for state and parameter estimation. In this direction, the class of sequential Monte Carlo, particle filter and MCMC algorithms can be used for maximum likelihood estimation and seems particularly well suited to our case. A collaboration with the University of Patras is starting on this issue. The same type of methods can be used for Bayesian inference, which is also being explored for situations in which priors are easy to determine (study of genetic populations, data assimilation, etc.).
-
Application to real plants. This aspect has been one of the strong points of Digiplante: a wide variety of plants have been studied with the GreenLab model, always confronting the model with experimental data. This work will continue, with the twofold objective of improving, validating and comparing models, and of testing our estimation methods. However, it is important to focus on the plants for which we have rich data sets, allowing a proper model validation (with separate training and testing data sets). The collaborations with ITB (for sugar beet), INRA-Grignon (for rapeseed), Supagro Montpellier (for sunflower and grapevine), Cirad-Guyana (for Cecropia) and China Academy of Forestry (for pine) are long-term partnerships that make it possible to obtain such data sets on different types of plants, with different levels of difficulty.
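As an illustration of the simulation-based likelihood estimation mentioned under "Theory and methods", the following minimal sketch estimates the log-likelihood of a hidden-state model with a bootstrap particle filter. It is not the GreenLab estimation code: the linear Gaussian toy model, the function names (`simulate`, `particle_loglik`) and all parameter values are assumptions chosen only to keep the example short and self-contained.

```python
import numpy as np

def simulate(a, q, r, n_steps, rng):
    """Simulate observations from a toy hidden-state model (an assumption):
    x_t = a * x_{t-1} + N(0, q),  y_t = x_t + N(0, r),  x_0 = 0."""
    x = 0.0
    ys = np.empty(n_steps)
    for t in range(n_steps):
        x = a * x + rng.normal(0.0, np.sqrt(q))
        ys[t] = x + rng.normal(0.0, np.sqrt(r))
    return ys

def particle_loglik(ys, a, q, r, n_particles, rng):
    """Bootstrap particle filter estimate of log p(y_1..y_T | a, q, r)."""
    particles = np.zeros(n_particles)  # all particles start at x_0 = 0
    loglik = 0.0
    for y in ys:
        # Propagate each particle through the hidden-state dynamics.
        particles = a * particles + rng.normal(0.0, np.sqrt(q), n_particles)
        # Log-weights: Gaussian observation density of y given each particle.
        logw = -0.5 * np.log(2.0 * np.pi * r) - 0.5 * (y - particles) ** 2 / r
        # The mean weight estimates p(y_t | y_1..y_{t-1}); subtract the
        # maximum log-weight first for numerical stability.
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())
        # Multinomial resampling proportional to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        particles = particles[idx]
    return loglik

# Usage: profile the estimated log-likelihood over a grid of values of a.
rng = np.random.default_rng(0)
ys = simulate(a=0.8, q=0.5, r=0.1, n_steps=100, rng=rng)
for a in (0.2, 0.5, 0.8):
    print(a, particle_loglik(ys, a, 0.5, 0.1, 1000, rng))
```

Because the estimate is itself random, maximizing it over the parameters requires stochastic optimization schemes such as the stochastic EM or stochastic gradient variants mentioned above, rather than a standard deterministic optimizer.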
Model selection
In plant growth modeling, it seems that each research group is developing its own model. It is thus crucial to compare the existing models, both conceptually and mathematically, in order to assess their differences and select the 'best' models with respect to specific objectives. Therefore, several classical models (STICS, PILOTE, ADEL-NEMA, SUNFLO / CORNFLO, etc.) are also considered in IPANEMA alongside the GreenLab model. Our objective is to test different selection criteria, in particular MDL (Minimum Description Length), in collaboration with L2S Supélec-CNRS, and MSEP (mean squared error of prediction).
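To make the MSEP criterion concrete, here is a minimal sketch that estimates it by leave-one-out cross-validation and uses it to rank candidate models. The candidates are polynomials of increasing degree fitted to synthetic data; they merely stand in for competing plant growth models, and the function name `loo_msep` and the data are assumptions introduced for this example.

```python
import numpy as np

def loo_msep(x, y, degree):
    """Leave-one-out estimate of the mean squared error of prediction
    for a polynomial model of the given degree."""
    n = len(x)
    sq_errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                   # hold out observation i
        coeffs = np.polyfit(x[keep], y[keep], degree)
        prediction = np.polyval(coeffs, x[i])
        sq_errors[i] = (y[i] - prediction) ** 2
    return sq_errors.mean()

# Synthetic data from a quadratic trend plus noise (assumed values).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.05, x.size)

# Rank the candidate models by estimated MSEP (lower is better).
scores = {degree: loo_msep(x, y, degree) for degree in range(4)}
for degree, msep in sorted(scores.items(), key=lambda item: item[1]):
    print(f"degree {degree}: MSEP = {msep:.5f}")
```

Each prediction is made on an observation excluded from the fit, so the criterion penalizes models that fit the training data well only because they are over-parameterized.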
Optimization of experimental protocol for phenotyping
If we obtain a good estimate of the uncertainty in the model parameters (the objective of the research axis described in 3.2.1), we will also be able to optimize the experimental protocols. This is particularly important in phenotyping for seed companies, which need to evaluate the performance of large numbers of new varieties each year. The optimization concerns the amount of data to collect in a given experimental situation, and the number of experimental situations (with respect to climatic scenarios). The PhD of Fenni Kang studies these questions, in collaboration with J. Lecoeur (Syngenta).
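One common way to link parameter uncertainty to protocol design is through the Fisher information matrix, whose inverse approximates the covariance of the parameter estimates. The sketch below picks the measurement dates that maximize its determinant (D-optimality) for a toy saturating growth curve. The text above does not say which design criterion is used in the project, so the D-optimality criterion, the model y(t) = A (1 - exp(-k t)), the function names and all numerical values are assumptions made purely for illustration.

```python
import itertools
import numpy as np

def sensitivities(times, A, k):
    """Partial derivatives of y(t) = A * (1 - exp(-k t)) w.r.t. A and k."""
    t = np.asarray(times, dtype=float)
    d_A = 1.0 - np.exp(-k * t)
    d_k = A * t * np.exp(-k * t)
    return np.column_stack([d_A, d_k])

def fisher_information(times, A, k, sigma):
    """Fisher information for independent Gaussian errors of std sigma."""
    J = sensitivities(times, A, k)
    return J.T @ J / sigma**2

def best_design(candidates, n_obs, A, k, sigma):
    """Exhaustive search for the n_obs dates maximizing det(Fisher info)."""
    best_times, best_det = None, -np.inf
    for times in itertools.combinations(candidates, n_obs):
        det = np.linalg.det(fisher_information(times, A, k, sigma))
        if det > best_det:
            best_times, best_det = times, det
    return best_times, best_det

# Usage: choose 3 measurement dates among 10 candidate days,
# given assumed prior guesses for A, k and the noise level.
candidates = list(range(1, 11))
times, det = best_design(candidates, n_obs=3, A=10.0, k=0.3, sigma=0.5)
print("selected dates:", times, "det:", det)
```

The design depends on the prior guess of the parameters, which is why a reliable estimate of parameter uncertainty has to come first. Exhaustive search is only feasible for small candidate sets; larger problems would need a heuristic or exchange-type algorithm.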
Data acquisition from aerial images and data assimilation
Using real data is the key to decreasing model uncertainty. For this purpose, aerial (or satellite, or drone) images provide a very valuable source of information. Corina Iovan, a new researcher (Ingénieur Confirmé) in the group, is a specialist in image analysis for vegetation. The objective is to assimilate these data, in order to: